COMPARISON OF WHEAT CLASSIFICATION ACCURACY USING DIFFERENT 
CLASSIFIERS OF THE IMAGE-100 SYSTEM 

Sherry Chou Chen 
Maurício Alves Moreira 
Angela Maria de Lima 

Instituto de Pesquisas Espaciais - INPE 
Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq 
São José dos Campos - SP - Brasil 

ABSTRACT 

This paper compares wheat classification results obtained with the 
Single-cell and Multi-cell Signature Acquisition Options, a point-by-point 
Gaussian maximum-likelihood classifier, and K-means clustering 
of the Image-100 system. Each classifier was used to distinguish wheat 
from non-wheat in Cruz Alta, Rio Grande do Sul State, Brazil. Independent 
training and test areas (each area about 20 pixels) were used in the 
classification procedures. In order to give a more realistic view of the 
classification performance, a test area of approximately 40 km², with a 
variety of land cover types, was also selected. The rescaled alphanumeric 
theme print of each classifier was overlaid on IR aerial photographs. A 
point-by-point comparison of the theme print with its corresponding aerial 
photographs provided the percentages of correct classification and 
error of commission. The study results show that using small test areas 
of one cover type to evaluate classification performance may lead to an 
optimistically high percent correct classification. In addition, percent 
correct classification should not be the only factor used for evaluation. 
The error of commission plays an important role in the estimate of area, 
which is generally the main objective of a crop inventory study. Among 
the examined classifiers, the point-by-point Gaussian maximum-likelihood 
classifier, using four spectral subclasses of wheat, shows the best 
performance. This classifier gave an 87.3% correct classification, a 
12.9% commission error, and an overestimate of 0.2% in wheat area 
when compared to that obtained from aerial photographs. 

INTRODUCTION 


For correct crop identification using LANDSAT multispectral 
scanner data and computer-aided analysis, the distinguishable 
spectral response of the crop type has to be defined. There are several 
classifiers, based on different classification criteria, available in the 
Image-100 system of the Brazilian Institute for Space Research (INPE). 
All these classifiers can be used to derive statistics of the spectral 
responses for study classes. Thus, one of the problems commonly 
encountered by a remote sensing data analyst is deciding which classifier 
is the best to use. In order to obtain an accurate crop area estimate, 
the selected classifier should provide not only the maximum correct 
classification but also the minimum error of commission. 

In this study, the classification performances on wheat of one 
unsupervised and three supervised classifiers of the Image-100 system* 
were compared. Based on the study results, the optimal classifier will 
be selected for an on-going crop forecasting project. Due to time 
constraints, a qualitative comparison of classification 
results was made on the whole study area (≈ 400 km²), while a detailed 
quantitative analysis by point-by-point comparison was carried out in 
an intensive test area of 40 km². 


* Image-100 is an interactive image analyser marketed by General 
Electric Co. to analyze MSS data. 

STUDY AREA AND DATA ACQUISITION 

Cruz Alta is one of the major municipalities for wheat 
production in Rio Grande do Sul State, Brazil. The geographic location 
of this municipality is around 28°35'S and 53°45'W. An area in Cruz Alta 
of approximately 400 km², which is representative of the wheat plantations 
of the state, was selected as the study area (Fig. 1). In this region, 
depending on climatic conditions, wheat may be planted in April or May and 
harvested in October and November. For the crop year 1979, the wheat 
planting area in April decreased by 60% in comparison to the same period 
of 1978. This decrease is attributed to a dry spell in April and some 
changes in the financial system which made farmers reluctant to plant. 
Intensive planting was only initiated in late May, when these financial 
changes were lifted and a 100% loan was available to farmers (1). The wheat 
crop calendar, with a planting season in late May, is presented in Fig. 2. 

a) Aircraft Data Acquisition 


On September 4, 1979, a cloud-free day, INPE's Bandeirante 
aircraft was flown over the study area, and medium-scale (1:20,000) color 
infrared (CIR) aerial photographs were taken using an RC-10 photogrammetric 
camera. These aerial photographs were visually interpreted 
and served as reference data for wheat classification using LANDSAT data 
and the Image-100 system. 



Fig. 1 - Map showing the study area in Cruz Alta, Rio Grande 
do Sul State. 



[Figure: crop calendar chart spanning MAY to NOV, showing the stages 
planting, emergence, jointing, heading, flowering, soft dough, and harvesting] 

Fig. 2 - Crop calendar of wheat for Rio Grande do Sul 
State (1979). 

b) LANDSAT Data Acquisition 

For the crop year 1979, a LANDSAT pass at the 
end of September or in the beginning of October would have been ideal 
for wheat discrimination. This is because in September/October 
the wheat had matured and turned a golden-yellow color which 
was significantly different from the surrounding cover (predominantly 
pasture), which was still green. However, the 100% cloud cover of the LANDSAT 
data of September 22nd prohibited its use. LANDSAT digital data 
acquired on September 4th were substituted and used for this study. The 
path/row annotation of these digital data is 220/32. 

ANALYSIS PROCEDURES 

There are various classifiers available for INPE's Image-100 
system to perform analysis of remotely sensed data. The classifiers 
selected for this study were: 

1) Single-cell Signature Acquisition Option: This is a supervised 
classification procedure. Once the training areas for an 
informational class are selected by the analyst, the limits of 
the spectral responses of these training areas are used to create a 
four-dimensional rectangular parallelepiped, each side of which 
corresponds to the range of spectral response in one channel. 
An unknown pixel (picture element) is assigned to this 
informational class if its spectral responses in all four channels 
fall inside the parallelepiped. 

2) Multi-cell Signature Acquisition Option: In this mode of 
operation, the single-cell approximation is subdivided into 
many smaller cells. Each pixel of the training area may form 
a cell which occupies a discrete, known region in spectral 
space. In this study, the threshold was set to zero, i.e., 
only empty cells were discarded. 

3) MAXVER: This is a supervised, point-by-point Gaussian maximum-likelihood 
classifier implemented at INPE for the Image-100 
system. This classifier has the capacity to analyze 32 
classes with a maximum of 160 training areas. A detailed 
description of this classifier can be found in Velasco et al. 
(2). 

4) K-means classifier: This is an unsupervised clustering function. 
In this operation the analyst has little control over the 
establishment of the decision regions. The spectral information of 
the randomly selected training areas is clustered into several 
homogeneous spectral classes. These spectral classes must 
eventually be converted to informational classes by identifying 
the ground cover which corresponds to each spectral class. 
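
The two signature-acquisition options in items 1 and 2 can be illustrated with a minimal Python sketch. It is not the Image-100 implementation: the function names and the sample digital numbers are hypothetical, and each multi-cell "cell" is assumed here to be one unique combination of grey levels across the four channels, since the actual cell size is not stated in the text.

```python
from collections import Counter

def train_single_cell(training_pixels):
    """Per-channel (low, high) limits of the training pixels: one 4-D box."""
    return [(min(ch), max(ch)) for ch in zip(*training_pixels)]

def classify_single_cell(pixel, limits):
    """True if every channel value falls inside the parallelepiped."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(pixel, limits))

def train_multi_cell(training_pixels, threshold=0):
    """Keep cells whose training-pixel count exceeds the threshold.
    With threshold=0, only empty cells are discarded."""
    counts = Counter(tuple(p) for p in training_pixels)
    return {cell for cell, n in counts.items() if n > threshold}

def classify_multi_cell(pixel, cells):
    """True if the pixel falls in a cell occupied during training."""
    return tuple(pixel) in cells

# Hypothetical wheat training pixels (four MSS channels), illustration only.
wheat = [(20, 18, 40, 22), (22, 19, 44, 24), (21, 17, 42, 23)]
limits = train_single_cell(wheat)
cells = train_multi_cell(wheat)

unseen = (21, 18, 43, 23)  # inside the box, but not a training cell
print(classify_single_cell(unseen, limits))  # True
print(classify_multi_cell(unseen, cells))    # False
```

The unseen pixel is accepted by the single box but rejected by the cell set, which shows how sparsely occupied cells can omit pixels of the trained class when few training pixels are available.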

For supervised classification, fourteen wheat fields, 
three pasture fields and two plowed fields were used as training areas 
and carefully located on the image monitor using an electronic cursor. 
In the analyses with the Single-cell and Multi-cell Options, only the training 
areas of wheat were required. For MAXVER classification two training 
methods were used: a) all fourteen training areas were employed to form 
a single set of training statistics for one wheat class; and b) the fourteen 
training areas of wheat were divided into four subclasses according to 
their tonality differences on the CIR aerial photographs, and training 
statistics were then derived for each subclass of wheat. Generally speaking, 
the spectral response of a crop in a large and heterogeneous area may vary 
considerably due to differences in crop variety, phenological stage, 
soil type, moisture content, agricultural practice, etc. These heterogeneities 
in spectral response may not satisfy the assumption of 
Gaussian distribution required by the maximum-likelihood classifier. 
These two approaches for obtaining training statistics were used to test 
the effect of training method on the classification accuracy of MAXVER. 
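
The idea behind training method b) can be sketched as a Gaussian maximum-likelihood rule in which each spectral subclass has its own mean vector and covariance matrix, and the winning subclass is mapped back to its informational class. This is a minimal sketch and not the MAXVER code described in reference (2): equal prior probabilities and no rejection threshold are assumed, all names are hypothetical, and the toy data use two bands instead of four for brevity.

```python
import math

def mean_and_cov(pixels):
    """Sample mean vector and covariance matrix of a list of pixels."""
    n, d = len(pixels), len(pixels[0])
    mean = [sum(p[k] for p in pixels) / n for k in range(d)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def solve_and_logdet(a, b):
    """Solve a x = b by Gaussian elimination; also return log|det a|."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    logdet = 0.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[piv][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[c], m[piv] = m[piv], m[c]
        logdet += math.log(abs(m[c][c]))
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x, logdet

def log_likelihood(x, mean, cov):
    """Log of the multivariate Gaussian density at pixel x."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    sol, logdet = solve_and_logdet(cov, diff)
    maha = sum(d * s for d, s in zip(diff, sol))
    return -0.5 * (maha + logdet + len(x) * math.log(2 * math.pi))

def classify(x, stats, parent):
    """Pick the most likely subclass, then report its informational class."""
    best = max(stats, key=lambda name: log_likelihood(x, *stats[name]))
    return parent.get(best, best)

# Hypothetical two-band training pixels, illustration only.
training = {
    "wheat_1": [(10, 20), (12, 21), (11, 23), (13, 22)],
    "wheat_2": [(30, 40), (32, 41), (31, 43), (33, 42)],
    "pasture": [(20, 60), (22, 61), (21, 63), (23, 62)],
}
stats = {name: mean_and_cov(px) for name, px in training.items()}
parent = {"wheat_1": "wheat", "wheat_2": "wheat"}

print(classify((31, 42), stats, parent))  # wheat
print(classify((21, 61), stats, parent))  # pasture
```

Pooling the two wheat subclasses into one class would stretch a single Gaussian across both clusters, which is the situation training method a) risks in a heterogeneous area.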

Training areas for the unsupervised K-means classification 
were randomly selected, and twelve was assigned as the initial 
number of clusters. However, results using twelve cluster centers were 
too complex to associate with informational classes. After several 
iterations, eight centers were used for analysis. For purposes of 
comparison, another training method, using the same training areas 
employed in the supervised classifications, was also included in the K-means 
analysis. The classification approaches of MAXVER with a single wheat class, 
MAXVER with four subclasses of wheat, K-means using random training 
areas, and K-means using the same training areas as in the supervised 
classifications are referred to hereinafter as MAXVER-a, MAXVER-b, 
K-means (a) and K-means (b), respectively. Once statistics 
were obtained from the training areas, the classification accuracy of each 
classifier was examined on independent test areas (each with about 20 
pixels) which contained pixels of one cover type. This method of using 
independent sets of one-cover-type areas for training and testing has been 
widely employed to evaluate classification results (3). However, the 
authors feel that testing a classifier on small areas with one cover 
type may lead to an optimistically high percentage of correct classification 
due to the simplicity of the test areas. Thus, in order to give 
a more realistic view of classification accuracy, an intensive test 
area of 40 km², with various cover types and representing the complexity 
of the study area, was also classified using each classification approach. 
For a detailed quantitative analysis, the alphanumeric theme print 
(1:20,000) of the intensive test area produced by each classifier was overlaid 
on CIR aerial photographs of the same scale. Boundaries of each 
cover type were then locally fitted on the printout to correct relief 
displacement errors, using plowed fields as control points. After 
boundary delineation, the correctly classified points were counted to 
assess the classifier effect on percent correct classification of wheat, 
other cover types, and overall (wheat and other cover types as a whole). 
The proportion of other cover types which was erroneously classified 
as wheat was also calculated and designated as commission error. 
The wheat area estimated using each classifier and the Image-100 system 
was compared to that obtained from aerial photographs. All of these data 
are presented in tabular form. 
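
The three measures described above can be sketched from point counts as follows. The text does not state the denominator of the commission error explicitly; this sketch assumes it is the total number of points labeled wheat by the classifier. Under that assumption, the 87.3% correct classification and 12.9% commission error reported for the best classifier imply a wheat-area overestimate of roughly 0.2%. Function names and counts are hypothetical.

```python
def accuracy_measures(wheat_ref, wheat_hit, other_as_wheat):
    """All arguments are point counts from the overlay comparison.

    wheat_ref      -- points that are wheat on the aerial photographs
    wheat_hit      -- wheat points also labeled wheat by the classifier
    other_as_wheat -- non-wheat points labeled wheat by the classifier
    """
    labeled_wheat = wheat_hit + other_as_wheat
    correct = 100.0 * wheat_hit / wheat_ref
    # Assumption: commission is taken relative to all points labeled wheat.
    commission = 100.0 * other_as_wheat / labeled_wheat
    # Positive value: the classifier over-estimates the wheat area.
    relative_diff = 100.0 * (labeled_wheat - wheat_ref) / wheat_ref
    return correct, commission, relative_diff

def implied_relative_difference(correct_pct, commission_pct):
    """Relative area difference implied by the two percentages alone,
    under the same assumption about the commission denominator."""
    return 100.0 * (correct_pct / (100.0 - commission_pct) - 1.0)

# Hypothetical counts, for illustration only.
print(accuracy_measures(wheat_ref=1000, wheat_hit=900, other_as_wheat=100))
# (90.0, 10.0, 0.0)

# Percentages reported for the four-subclass maximum-likelihood classifier.
print(round(implied_relative_difference(87.3, 12.9), 1))  # 0.2
```

The example with hypothetical counts shows why percent correct alone is not enough for area estimation: omitted wheat points and committed non-wheat points can cancel or compound in the area total.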

After comparisons were made on the test sites, the training 
statistics of each classifier were applied to the whole study area. 
The classification results of each classifier were displayed on the color CRT 
of the Image-100 system in thematic format. Slides of these results were 
taken, and visual comparisons were made to assess classification differences 
among classifiers. 

RESULTS AND DISCUSSION 

Among the various classifiers which used the same 
training areas to classify wheat, the Multi-cell Option gave the lowest 
percentage of correct classification. The unacceptably low correct 
classification is due to the fact that only a small fraction of wheat 
pixels was used for training (Table I). This relatively low pixel 
number caused the unit cells to be sparsely distributed in the 
four-dimensional spectral space, so that many empty cells may actually 
represent wheat but did not contain any training pixel. Many 
wheat pixels were thus omitted from identification. Because of its extremely 
poor performance, the Multi-cell Option is excluded from presentation. 

Table II shows the correct classification of various 
classifiers in small test areas of one cover type. As expected, most of 
the classifiers had an almost perfect performance. Differences among 
classifiers were only revealed by comparing the point-by-point classification 
results in the intensive test area. All classifiers, except 
K-means (b), have good capabilities to identify wheat (Table III). The 
lowest percentage of correct classification, observed in K-means (b), 
indicates that some wheat spectral responses were not defined by clustering. 
This is because 192 wheat pixels were insufficient for clustering: a 
characteristic wheat spectral response with a low pixel frequency may not 
be selected as a cluster center. Hence, one-fourth of the wheat pixels 
were not classified as wheat. The classifier effects on correct classification 
of wheat, other cover types, and overall were evaluated using 
analysis of variance on the arcsine-transformed data of Table III. No 
statistically significant difference in correct classification among 
classifiers was found. However, besides correct classification, commission 
error is also an important factor in the evaluation of classification 
performance. A classifier with a high percent correct classification for a 
given study class and a high commission error may perform as badly as 

TABLE I 

COVER TYPE       NO. OF TRAINING FIELDS     PIXEL NO. 

WHEAT                      14                  192 
PASTURELAND                 3                  108 
PLOWED AREA                 2                   24 

TOTAL                                          324 


TABLE II 

PERCENTAGE OF CORRECT CLASSIFICATION IN ONE-COVER-TYPE TEST 
AREA FOR SEVERAL CLASSIFIERS 

                      CORRECT CLASSIFICATION % 
CLASSIFIER       WHEAT     PASTURELAND     PLOWED AREA 

SINGLE CELL       98.3          -               - 
MAXVER (a)       100.0        100.0           100.0 
MAXVER (b)        99.0        100.0           100.0 
K-MEANS (a)       85.7         97.2           100.0 


TABLE III 

POINT-BY-POINT COMPARISON OF CORRECT CLASSIFICATION IN AN INTENSIVE 
TEST AREA WITH MORE THAN ONE COVER TYPE USING SEVERAL CLASSIFIERS 

                      CORRECT CLASSIFICATION % 
CLASSIFIER       WHEAT       OTHER       OVERALL 

SINGLE-CELL       88.1        81.6         85.0 
MAXVER (a)        84.6        82.5         83.6 
MAXVER (b)        87.3        85.3         84.5 
K-MEANS (a)       88.7        79.4         84.5 
K-MEANS (b)       75.3        89.8         81.9 


TABLE IV 

POINT-BY-POINT COMPARISON OF CLASSIFICATION PERFORMANCE 
FOR SEVERAL CLASSIFIERS 

                      CLASSIFICATION PERFORMANCE % 
CLASSIFIER       CORRECT            ERROR OF        RELATIVE 
                 CLASSIFICATION     COMMISSION      DIFFERENCE (1) 

SINGLE-CELL          88.1              15.8             -4.5 
MAXVER (a)           84.6              14.8             -0.7 
MAXVER (b)           87.3              12.9             +0.2 
K-MEANS (a)          88.7              16.0             +5.6 
K-MEANS (b)          75.1              10.7            -15.7 

(1) Relative difference was obtained by comparing area estimates from 
Image-100 and aerial photographs. A negative sign indicates an 
under-estimate, while a positive sign indicates an over-estimate of wheat 
area by the Image-100 system. 

those classifiers having low percent correct classifications. Table IV 
summarizes the correct classification, error of commission and the 
relative difference of wheat area estimates for each classifier. 
Commission error ranges from 10.7 to 16.0% for all tested classifiers. 
The magnitude of this error indicates that there is a certain amount of 
confusion between the spectral responses of wheat and pasture. This is 
because wheat was in the heading/flowering stage in September and presented 
a response similar to that of some well-established pastureland. MAXVER-b, which 
gave an 87.3% correct classification, a 12.9% commission error and the 
smallest relative difference in area estimates (+0.2%), seems to be the 
most suitable classifier for wheat area estimation in the test area. The 
performance of MAXVER-b was also the best in the whole study area (Fig. III), 
as judged by comparing slides showing the thematic map of each classifier. 
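
The analysis of variance on arcsine-transformed percentages mentioned for Table III can be sketched as below. The experimental design is not described in the text, so this sketch assumes a two-way layout without replication, with classifiers as treatments and the three accuracy categories as blocks; the values are those of Table III as printed, and the critical value is the tabulated F at the 5% level for 4 and 8 degrees of freedom.

```python
import math

def arcsine_transform(percent):
    """Angular transformation for percentage data, in radians."""
    return math.asin(math.sqrt(percent / 100.0))

def f_ratio_for_rows(table):
    """F ratio for row effects in a two-way layout without replication
    (rows = treatments, columns = blocks)."""
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (r - 1)
    ms_error = ss_error / ((r - 1) * (c - 1))
    return ms_rows / ms_error

# Table III as printed: percent correct for wheat, other, overall.
table_iii = {
    "SINGLE-CELL": [88.1, 81.6, 85.0],
    "MAXVER (a)":  [84.6, 82.5, 83.6],
    "MAXVER (b)":  [87.3, 85.3, 84.5],
    "K-MEANS (a)": [88.7, 79.4, 84.5],
    "K-MEANS (b)": [75.3, 89.8, 81.9],
}
transformed = [[arcsine_transform(p) for p in row]
               for row in table_iii.values()]
f_value = f_ratio_for_rows(transformed)

F_CRITICAL = 3.84  # tabulated F(0.05; 4, 8)
print(f_value < F_CRITICAL)  # True: no significant classifier effect
```

Under these assumptions the F ratio falls well below the critical value, in line with the statement that no statistically significant difference among classifiers was found.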

The significant results obtained from this study for 
wheat classification are: 

- A better indication of correct classification can be provided 
by using a test area which contains the various cover types of the 
study area. 

- Classification accuracy should be evaluated considering both the 
percentage of correct classification and the error of commission. A 
higher percent correct classification for a given class can 
be obtained by a trade-off with the percent correct classifications 
of other cover types. Consequently, these lowered percentages 
in other cover types may lead to a high commission error, 
which is not desirable for area estimation either. 

- Supervised classification approaches are better than K-means 
clustering. 

- The Gaussian maximum-likelihood classifier is better 
than the Single-cell and Multi-cell Signature Acquisition Options 
of the Image-100 system. 

- In order to obtain a high classification accuracy in a large 
and heterogeneous crop area using the Gaussian maximum-likelihood 
classifier, homogeneous spectral subclasses of the study crop 
should be created to derive training statistics. 

Fig. III - Classification results in the study area using MAXVER-b 
classifier. 